ABSTRACT



Designers of clinical trials for Alzheimer's disease (AD) and mild cognitive impairment (MCI) are actively 
considering structural and functional neuroimaging, cerebrospinal fluid and genetic biomarkers to reduce 
the sample sizes needed to detect therapeutic effects. Genetic pre-selection, however, has been limited to 
Apolipoprotein E (ApoE). Recently discovered polymorphisms in the CLU, CR1 and PICALM genes are also 
moderate risk factors for AD; each affects lifetime AD risk by ~10-20%. Here, we tested the hypothesis that 
pre-selecting subjects based on these variants along with ApoE genotype would further boost clinical trial 
power, relative to considering ApoE alone, using an MRI-derived 2-year atrophy rate as our outcome mea- 
sure. We ranked subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) based on their cumu- 
lative risk from these four genes. We obtained sample size estimates in cohorts enriched in subjects with 
greater aggregate genetic risk. Enriching for additional genetic biomarkers reduced the required sample 
sizes by up to 50%, for MCI trials. Thus, AD drug trial enrichment with multiple genotypes may have potential 
implications for the timeliness, cost, and power of trials. 

© 2013 The Authors. Published by Elsevier Inc. All rights reserved. 



1. Introduction 

Dementia is a devastating disease, with costs exceeding those of 
cancer and heart disease (Hurd et al., 2013). There is a crucial need 
for Alzheimer's disease treatments that go beyond merely alleviating 
symptoms of this devastating disease. Powerful clinical trial designs 
are vital to novel drug development. In recent years, multiple ap- 
proaches using neuroimaging, biochemical and genetic biomarkers 
have been proposed for enrichment of AD trials, i.e., reducing the 
number of subjects needed to detect potential drug effects. By 
enriching a sample in subjects who are more likely to cognitively de- 
cline, there is a greater change for the treatments to resist, and an 
expected gain in power. Structural MRI measures, for instance, are 
useful not only for measuring brain atrophy (Frisoni et al., 2010; 
Hua et al., 2011; Leow et al., 2006; McEvoy et al., 2010) but also for 
predicting brain atrophy, which can be used to pre-select subjects 
who will decline in a clinical trial (Lorenzi et al., 2010; McEvoy et 
al., 2010). PET scanning with FDG or PiB, cerebrospinal fluid (CSF) 
measures of Abeta42, tau and phosphorylated tau, as well as 
genotyping for the Apolipoprotein E (ApoE) ε4 allele (ApoE4) have 
also been proposed to pre-select people for clinical trials, reducing 
sample size requirements (Lorenzi et al., 2010; McEvoy et al., 2010). 
Multivariate methods can also combine multiple biomarkers to select 
subjects more likely to show detectable change in hypothetical AD 
trials (Kohannim et al., 2010). 
ApoE is the greatest known genetic risk factor for late-onset AD 
(Bertram et al., 2007) and ApoE genotyping has already been studied 
for pre-selecting individuals for potential AD trial enrichment (Beckett 
et al., 2010; Kohannim et al., 2010; McEvoy et al., 2010; Nestor et al., 
2008; Risacher et al., 2010). Carriers of at least one copy of the ApoE4 
variant make up ~20% of the normal elderly population (e.g., Crivello 
et al., 2010), and each E4 allele carried roughly triples a person's lifetime 
risk of developing AD. Even so, ApoE does not account for all of the 
inherited risk for AD. As other risk genes are discovered and validated, 
genotyping should even further reduce sample size requirements. 
Two recent genome-wide association (GWA) studies implicated 
several highly prevalent genetic variants in the clusterin (CLU), 
phosphatidylinositol binding clathrin assembly protein (PICALM) and 
the complement component receptor 1 (CR1) genes, beyond the well- 
known ApoE gene; carriers of risk-conferring variants in any of these 
genes have an additional 10-20% increased risk for Alzheimer's disease, 
if other risk factors are equal (CLU and CR1: Lambert et al., 2009; CLU and 
PICALM: Harold et al., 2009). Since then, several studies, including 
meta-analyses, have replicated these findings in large cohorts of healthy 
controls and pathologically confirmed, late-onset AD patients (Carrasquillo 
et al., 2010; Corneveaux et al., 2010; Jun et al., 2010). Most recently, 
Ferrari et al. (2012) also verified the associations of these genetic vari- 
ants with AD by sequencing the respective genes' coding regions. 
Though the precise role of these specific variants remains unknown, 
there are several links between the protein products of CLU, PICALM, 
and CR1 and AD pathogenesis: CLU with beta-amyloid aggregation, 
deposition, and clearance, PICALM with synaptic function, clathrin- 
mediated endocytosis, and amyloid precursor protein recycling, and 
CR1 with inflammation and complement-mediated beta-amyloid 
clearance (Holton et al., 2013; Jun et al., 2010; Sleegers et al., 2010). 

Here, we assessed how the 3 recently discovered AD risk genes 
(i.e., PICALM, CLU and CR1), along with ApoE, combine to affect 
brain tissue loss rates in Alzheimer's Disease Neuroimaging Initiative 
(ADNI) subjects. We hypothesized that a genetic score derived from 
these top AD risk genes would enrich a hypothetical 2-year clinical 
trial beyond just selecting ApoE4 carriers. We focused on MCI and 
healthy controls, similarly to prior enrichment studies (Lorenzi 
et al., 2010; McEvoy et al., 2010), mainly due to the crucial need for 
early interventions in AD (Kozauer and Katz, 2013). 

2. Methods 

2.1. Subjects 

Subjects considered for this study were participants in the first 
phase of the ADNI project, also known as "ADNI1" (Weiner et al., 
2012). ADNI was launched as a public-private partnership in 2004 
by the NIH, private pharmaceutical companies, and non-profit 
organizations. The goal of ADNI is to determine biological markers of 
Alzheimer's disease through neuroimaging, genetics, neuropsycho- 
logical tests and other measures, to develop and monitor new thera- 
pies, and reduce the time of clinical trials. Subjects were recruited 
from 58 sites across North America. The study was conducted 
according to the Good Clinical Practice guidelines, the Declaration of 
Helsinki, and U.S. 21 CFR Part 50— Protection of Human Subjects, 
and Part 56— Institutional Review Boards. Written informed consent 
was obtained from all participants before protocols were performed. 
All data acquired as part of this study are publicly available 
(http://adni.loni.ucla.edu/). 

All ADNI subjects underwent thorough clinical and cognitive as- 
sessment at the time of scan acquisition to establish diagnosis. In 
this study, all subjects with both baseline 1.5-Tesla scans and longitu- 
dinal follow-up scans at 24 months were assessed. The 24-month 
follow-up was picked, as 2 years is a fairly typical duration for an 
MCI or AD clinical trial. All subjects with available 24-month scans 
and genotypes were included. The mini-mental state exam (MMSE) 
was administered to provide a global measure of mental status 
(Cockrell and Folstein, 1988). The Clinical Dementia Rating (CDR) 
was used to assess dementia severity (Morris, 1993). Healthy volun- 
teer status was determined if a subject had MMSE scores between 
24 and 30 (inclusive), a CDR of 0, and was non-depressed, non-MCI, 
and non-demented. Mild cognitive impairment (MCI) diagnosis 
was determined if a subject had MMSE scores between 24 and 30 
(inclusive), a memory complaint, objective memory loss measured 
by education-adjusted scores on the Wechsler Memory Scale Logical 
Memory II, a CDR of 0.5, absence of significant levels of impairment 
in other cognitive domains, essentially preserved activities of daily 
living, and an absence of dementia. AD was diagnosed based on the 
National Institute of Neurological and Communicative Disorders and 
Stroke and the Alzheimer's Disease and Related Disorders Association 
(NINCDS-ADRDA) criteria for probable AD (McKhann et al., 1984), 
MMSE scores between 20 and 26 (inclusive), and CDR of 0.5 or 1.0. 
Definitive autopsy-based diagnosis of AD was not possible. 

Here, we studied 500 ADNI subjects (mean age of 75.5 ± 6.5 SD; 
296 males and 204 females; see Table 1 for full demographics) with 
available genotypes and scans at baseline and 24 months after diag- 
nosis. All subjects are Caucasian, identified by self-report, and con- 
firmed with multi-dimensional scaling analysis (Stein et al., 2010). 

2.2. Imaging 

MRI scans of the brain were obtained from ADNI subjects with a 
standard 1.5 T protocol. A sagittal 3D MP-RAGE sequence was used that had 
been optimized for consistency across sites (Jack et al., 2008; Leow et al., 
2006; TR/TE = 2400/1000 ms; flip angle = 8°; FOV = 24 cm; final 
reconstructed voxel resolution = 0.9375 × 0.9375 × 1.2 mm³). Scans 
at 12 and 24 months were linearly registered to baseline scans and 
aligned to a template after standard corrections, which include 
Gradwarp (Jovicich et al., 2006), B1-correction (Jack et al., 2008), bias 
field correction (Sled et al., 1998), and phantom-based geometrical scal- 
ing (Gunter et al., 2006). Masks excluding non-brain tissues were 
obtained using a robust brain extraction tool (ROBEX; Iglesias et al., 
2011). A bias-free tensor-based morphometry analysis (TBM; Hua 
et al., 2013; Leow et al., 2006) was used based on a non-linear 
inverse-consistent elastic registration that treats the baseline and follow-up 
scans identically, and symmetrically (Leow et al., 2005). In voxelwise 
analysis, the volumetric change over time (over a 2-year follow-up) 
was regressed against the joint effect of genetic variants at each voxel, 
using multiple linear regression. Correction for multiple spatial 
comparisons was performed with a regional false discovery rate (FDR) method 
(Langers et al., 2007). In power analyses, a 2-year atrophy rate was 
computed from a statistically defined region-of-interest to summarize 
temporal lobe atrophy (Hua et al., 2011). 

Table 1 

Number of subjects, distribution of sex and age (mean ± SD years), number of carriers 
(C) and non-carriers (NC) for the ApoE4 risk allele, and distribution of alleles for CLU 
(rs11136000), PICALM (rs3851179) and CR1 (rs3818361) AD risk-conferring 
polymorphisms are displayed for subjects with Alzheimer's disease (AD), mild cognitive 
impairment (MCI), healthy controls (CTL), and also for all 500 ADNI subjects combined. 
These 500 subjects were studied, as they had available 24-month as well as baseline 
MRI scans. Minor allele frequencies (MAFs) and Hardy-Weinberg Equilibrium (HWE) 
p-values for the candidate variants are also shown in the last column. The risk allele 
for CR1 is the minor allele (i.e., T), but the risk alleles for CLU and PICALM are the 
major alleles (i.e., C and G, respectively). 

          AD            MCI           CTL           All           MAF/HWE p 
N         106           234           160           500 
Age       75.5 ± 7.4    74.9 ± 6.9    76.3 ± 4.9    75.5 ± 6.5 
Sex       50 F          79 F          75 F          204 F 
          56 M          155 M         85 M          296 M 
ApoE4     73 C          128 C         45 C          246 C 
          33 NC         106 NC        115 NC        254 NC 
CLU       36 C/C        95 C/C        61 C/C        192 C/C       0.38/0.84 
          54 C/T        104 C/T       76 C/T        234 C/T 
          16 T/T        35 T/T        23 T/T        74 T/T 
PICALM    44 G/G        102 G/G       58 G/G        204 G/G       0.36/0.69 
          50 G/A        102 G/A       82 G/A        234 G/A 
          12 A/A        30 A/A        20 A/A        62 A/A 
CR1       64 C/C        153 C/C       119 C/C       336 C/C       0.18/0.76 
          38 C/T        72 C/T        39 C/T        149 C/T 
          4 T/T         9 T/T         2 T/T         15 T/T 

The allele frequency distributions between AD, MCI, and CTL did not show any 
statistically significant differences in CLU (χ² = 1.55, p = 0.82), PICALM (χ² = 2.60, 
p = 0.63), or CR1 (χ² = 7.43, p = 0.11). 

2.3. Genetic risk score calculation 

The ADNI genotyping procedures are thoroughly described in 
Saykin et al. (2010). For this study, a genetic risk score was calculated 
from previously identified and replicated top SNPs in CLU 
(rs11136000), PICALM (rs3851179) and CR1 (rs3818361) using a 
linear model of aggregate disease risk, with weights for each variant 
equal to the logarithm of odds ratios reported in a recent meta- 
analysis by Jun et al. (2010), which is in agreement with odds ratios 
reported in other studies (Harold et al., 2009; Lambert et al., 2009). 
Odds ratios for the minor alleles of CLU, PICALM, and CR1 were 
therefore set to 0.9, 0.9 and 1.1, respectively. The odds ratio for carrying 
ApoE4 was set to 3. This is an underestimation as odds ratios are 
higher in ApoE ε4/ε4 subjects (Farrer et al., 1997) who are also 
included here, but any larger value would yield the same ranking of 
subjects as the effect of ApoE4 overrides that of all other candidate 
SNPs (in other words the score assigned to ApoE4 is so large that 
the exact value of the score does not affect the eventual ranking). 
The score calculation and ranking procedure is further detailed in 
the Supplementary material and Inline Supplementary Table S1. 

Inline Supplementary Table S1 can be found online at 
http://dx.doi.org/10.1016/j.nicl.2013.05.007. 
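As an illustration only, the linear log-odds score described in this section can be sketched in Python. The sketch assumes natural logarithms, an additive count (0, 1 or 2) of minor alleles for CLU, PICALM and CR1, and a binary ApoE4 carrier indicator; the function and variable names are hypothetical, and the exact procedure is the one given in the Supplementary material.

```python
import math

# Odds ratios as stated in the text: minor alleles of CLU, PICALM and CR1,
# and carrier status for ApoE4. The weights are their logarithms
# (natural log is an assumption of this sketch).
ODDS_RATIOS = {"CLU": 0.9, "PICALM": 0.9, "CR1": 1.1, "APOE4": 3.0}
WEIGHTS = {gene: math.log(odds) for gene, odds in ODDS_RATIOS.items()}


def genetic_risk_score(clu_minor, picalm_minor, cr1_minor, apoe4_carrier):
    """Hypothetical helper: weighted sum of log odds ratios.

    clu_minor, picalm_minor, cr1_minor: minor-allele counts (0, 1 or 2);
    additive coding is an assumption of this sketch.
    apoe4_carrier: True if the subject carries at least one ApoE4 allele.
    """
    for count in (clu_minor, picalm_minor, cr1_minor):
        if count not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1 or 2")
    return (
        WEIGHTS["CLU"] * clu_minor
        + WEIGHTS["PICALM"] * picalm_minor
        + WEIGHTS["CR1"] * cr1_minor
        + WEIGHTS["APOE4"] * (1 if apoe4_carrier else 0)
    )


def rank_by_risk(subjects):
    """Return subject IDs sorted from highest to lowest score.

    subjects: dict mapping subject ID -> (clu, picalm, cr1, apoe4_carrier).
    """
    return sorted(
        subjects,
        key=lambda sid: genetic_risk_score(*subjects[sid]),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up genotypes, for illustration only.
    toy = {
        "s1": (0, 0, 0, False),
        "s2": (2, 2, 0, True),
        "s3": (0, 0, 2, True),
    }
    for sid in rank_by_risk(toy):
        print(sid, round(genetic_risk_score(*toy[sid]), 3))
```

Under these assumptions, even the lowest-scoring ApoE4 carrier (two minor alleles each of CLU and PICALM, none of CR1) scores above the highest-scoring non-carrier, which is consistent with the remark that the ApoE4 weight dominates the ranking.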

In voxelwise analysis, multiple linear regression was used to predict 
volumetric differences at each voxel based on candidate genotypes, 
after adjusting for covariates (sex, age, and population structure param- 
eters derived from multi-dimensional scaling analysis (MDS) as de- 
tailed in Hibar et al., 2011; Stein et al., 2010). This multiple regression 
analysis was more stable than regressing the same aggregate risk 
score against the signal at each voxel, as the variants' odds ratios from 
case-control studies may not necessarily correspond to the appropriate 
weights for predicting volume differences at each voxel. 
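
For concreteness, the voxelwise model described above can be written as follows. The notation is introduced here for illustration and is not taken from the original: y_i(v) is the 2-year volumetric change of subject i at voxel v, the g terms are allele counts (additive coding is assumed), MDS_ik are the population structure parameters, and K is their number, which is not specified in this section.

```latex
y_i(v) = \beta_0(v)
       + \beta_{\mathrm{CLU}}(v)\, g_i^{\mathrm{CLU}}
       + \beta_{\mathrm{PICALM}}(v)\, g_i^{\mathrm{PICALM}}
       + \beta_{\mathrm{CR1}}(v)\, g_i^{\mathrm{CR1}}
       + \gamma_1(v)\, \mathrm{sex}_i
       + \gamma_2(v)\, \mathrm{age}_i
       + \sum_{k=1}^{K} \delta_k(v)\, \mathrm{MDS}_{ik}
       + \varepsilon_i(v)
```

In this form each SNP receives its own coefficient at every voxel, rather than a fixed weight borrowed from case-control odds ratios, which is the distinction drawn in the paragraph above.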

2.4. Power analysis 

As in prior studies (Hua et al., 2011; Kohannim et al., 2010), sam- 
ple size estimates were calculated according to the following formula: 



$$
n_{80} = \frac{2\sigma^{2}\,\left(z_{1-\alpha/2} + z_{\mathrm{power}}\right)^{2}}{(0.25\,\beta)^{2}}
$$

Here β and σ denote the mean and standard deviation of the atrophy 
rates, respectively, α is set to the nominal 0.05, and the desired 
power is 80%. Subjects were ranked according to their composite ge- 
netic score to determine how much the sample size estimates would 
be reduced if the study was restricted to those with higher-risk 
scores. The ranking procedure was randomized 1000 times to com- 
pare observed power results with those obtainable by chance, i.e., 
when just selecting subsets of people from the overall sample with- 
out regard to their genotypes. 
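
A minimal sketch of this power calculation and permutation comparison is given below. It assumes the sample standard deviation, a two-sided α of 0.05, and a one-sided permutation p-value defined as the fraction of random subsets whose estimate is at least as small as the observed one; the function names and these choices are assumptions made for illustration, not details stated in the text.

```python
import random
from statistics import NormalDist, mean, stdev


def n80(rates, alpha=0.05, power=0.80, effect=0.25):
    """Sample size estimate for detecting a 25% reduction in mean atrophy rate."""
    beta = mean(rates)    # mean atrophy rate
    sigma = stdev(rates)  # sample standard deviation (an assumption)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return 2 * sigma ** 2 * z ** 2 / (effect * beta) ** 2


def enriched_n80(rates, scores, k):
    """n80 computed within the k subjects with the highest genetic risk scores."""
    order = sorted(range(len(rates)), key=lambda i: scores[i], reverse=True)
    return n80([rates[i] for i in order[:k]])


def permutation_p(rates, scores, k, n_perm=1000, seed=0):
    """Fraction of random size-k subsets with n80 no larger than the observed one."""
    rng = random.Random(seed)
    observed = enriched_n80(rates, scores, k)
    hits = sum(n80(rng.sample(rates, k)) <= observed for _ in range(n_perm))
    return hits / n_perm


if __name__ == "__main__":
    # Made-up atrophy rates (% per 2 years) and risk scores, for illustration only.
    toy_rates = [1.0, 1.1, 0.9, 3.0, 3.2, 2.8, 1.05, 0.95]
    toy_scores = [0.0, 0.0, 0.0, 1.1, 1.1, 1.1, 0.0, 0.0]
    print("n80, all subjects:", round(n80(toy_rates), 1))
    print("n80, top 3 by score:", round(enriched_n80(toy_rates, toy_scores, 3), 1))
    print("permutation p:", permutation_p(toy_rates, toy_scores, 3))
```

The estimate is returned unrounded; in practice it would be rounded up to a whole number of subjects.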

3. Results 

We first studied the sample size enrichment effect using genotype 
scores for a mixed group of MCI and cognitively healthy control 
subjects, in the proportions they were enrolled in ADNI (i.e., roughly 2 
MCI subjects for each healthy control subject). We ranked the 394 
subjects by their cumulative risk score based on ApoE4 carrier status 
and their CLU, CR1 and PICALM genotypes. We related our risk ranking 
to the sample size for a potential clinical trial needed to detect a 25% 
reduction in atrophy rates within two years with 80% power. The 
temporal lobe atrophy rate we used for our sample size calculations 
was significantly correlated with 2-year change in sum-of-boxes 
Clinical Dementia Rating (CDR-SOB) scores (R² = 0.25, p < 2 × 10⁻¹⁶; 
see Inline Supplementary Fig. S1 for details). We observed boosting 
of trial power by selecting those at greater genetic risk (Fig. 1). 
Sample size estimates for a trial evaluating temporal lobe atrophy were 
reduced from 142 to 94 by selecting ApoE4 carriers, and to even 
smaller estimates in subjects with more risk alleles (69 when 
selecting only the top ~15% of the subjects, and 60 when selecting 
only the top ~10% of the subjects who had the greatest aggregate 
risk). These cutoffs correspond to risk scores of 1.08 and 1.10, 
respectively (see Inline Supplementary Table S1). We ensured this was not 
due to chance through 1000 permutations: all subjects were ranked 
randomly and sample size estimates were computed for random 
subsets (Fig. 1). When we coded ApoE as 0, 1 or 2 for the number of ε4 
alleles (instead of 0 for carrying no copies of ε4 and 1 for carrying 
any number of copies of ε4), boosting was still observed in a similar 
way (Inline Supplementary Fig. S2). 

Inline Supplementary Figs. S1 and S2 can be found online at 
http://dx.doi.org/10.1016/j.nicl.2013.05.007. 

Next, we tested whether the boosting of trial power also applies to 
MCI subjects and cognitively healthy controls, when considered sep- 
arately. Reduced sample sizes were observed in MCI subjects and 
were consistently lower than permuted estimates (Fig. 2A). In car- 
riers of at least one copy of the risk allele, ApoE4, the sample size es- 
timate was reduced to 67 (from 105 when considering all MCI 
subjects). Selecting the top ~20% and ~10% of high-risk MCI subjects 
yielded estimates of 58 and 50, respectively. These cut-offs 
correspond to the same risk score thresholds as those considered above, 
which led to reduced estimates of 69 and 60. In cognitively healthy 
subjects, a consistent pattern of reduced sample sizes from higher 
genetic risk scores was not observed (Fig. 2B). Selection of ApoE4 
carriers reduced the sample size estimate only from 120 to 117, 
which was not significant. Selecting the top ~10% of control subjects 
with highest genetic risk reduced this further to 55, but this reduction 
still fell short of significance when compared to permutations 
(p ≈ 0.11). 

Fig. 1. Sample size estimates (n80) are shown for a hypothetical clinical trial with a 
mixture of healthy control and mild cognitive impairment (MCI) subjects (in the 
proportions enrolled by ADNI), as a function of number of subjects (N) after ranking 
subjects according to a cumulative genetic risk score derived from ApoE4 (coded as 0 or 1 
for carrier status) and single nucleotide polymorphisms in CLU, CR1 and PICALM (blue). 
The breakpoints represent different risk score values, as these are not continuous (see 
Inline Supplementary Table S1 for a list of risk scores and corresponding allele 
combinations). Permutations are performed by randomizing the ranking procedure 1000 
times and calculating n80 estimates for each N (all permuted estimates are shown in 
black). ApoE4 reduces n80s from 142 to 94, based on data from a subset of 173 subjects 
(44% of all MCI and controls). Using the cumulative genetic risk score, this is further 
reduced to 69 when pre-selecting the top ~15%, and 60 when pre-selecting the top ~10% 
of the subjects. 

Fig. 2. Sample size estimates (n80) are shown for hypothetical clinical trials as a 
function of number of subjects (N) after ranking subjects according to a cumulative genetic 
risk score derived from ApoE4 (coded as 0 or 1 for carrier status) and single nucleotide 
polymorphisms in CLU, CR1 and PICALM (blue). The breakpoints represent different risk 
score values, as these are not continuous (see Inline Supplementary Table S1 for a list 
of risk scores and corresponding allele combinations). Permutations are performed by 
randomizing the ranking procedure 1000 times and calculating n80 estimates for each 
N (all permuted estimates are shown in black). Panel A shows results for MCI subjects. 
Selection of subjects who carry at least one copy of the risk allele, ApoE4 (~55% of all 
MCI subjects), reduces n80s from 105 to 67. This is further reduced by the remaining 
three risk variants to 58 in the top ~20% and to 50 in the top ~10% of MCI subjects, 
respectively. Permutations consistently provide worse estimates. Panel B displays similar 
results for cognitively healthy subjects only. Selecting ApoE4 carriers (28% of all 
controls) reduces n80 minimally from 120 to 117. Selecting the top ~10% of control 
subjects with highest genetic risk reduces n80 to 55, still falling short of significance 
when compared to random permutations (p ≈ 0.11). 

We then sought to statistically explore how much CLU, PICALM, 
and CR1 contribute to the reduction in sample size, independently 
of ApoE. We performed 1000 permutations as above, but instead 
started with ApoE carriers and permuted within this subgroup 
only to see if further n80 reductions from the aggregate genotype 
risk score were statistically significant. In the mixture of MCI and 
cognitively normal ApoE4 carriers, boosting of n80 from 94 to 69 by 
selecting the subjects with risk scores above 1.08 (top ~15% in 
above analysis) was statistically significant (p = 0.035). Selecting 
those with scores above 1.10 (top ~10% above), which reduced n80 
from 94 to 60, was suggestive, but did not reach significance (p = 
0.070). When we considered MCI ApoE4 carriers alone, neither of the 
reductions from 67 to 58 or to 50 was statistically significant (p = 0.20 
and p = 0.14, respectively). 

Although we were mostly interested in MCI and healthy control 
subjects due to the importance of early intervention in clinical trials, 
we also investigated a similar enrichment strategy within the whole 
cohort of 500 subjects, including those with AD. Similar results were 
obtained with the inclusion of AD subjects. Selecting the subset of 
246 ApoE carriers reduced the sample size from 135 to 84. The sample 
sizes were further reduced to 65 and 59 by selecting subsets of 
subjects with high genetic risk scores with the same two score 
breakpoints used above. Permutations confirmed statistical signifi- 
cance. Boosting in the full cohort - and corresponding permutation 
results - are shown in Inline Supplementary Fig. S3. 

Inline Supplementary Fig. S3 can be found online at 
http://dx.doi.org/10.1016/j.nicl.2013.05.007. 

When regressed against the 24-month atrophy rate based on the 
temporal lobe statistical ROI in all 500 ADNI subjects, our 4-gene 
risk score was statistically significant in the expected direction 
(β = −1.43; p < 2 × 10⁻¹⁶) after adjusting for age, sex, and 
population structure. When ApoE was not included in the score, but 
included as a covariate instead, the 3-gene risk score association was 
still in the expected direction, but no longer statistically significant 
(β = −1.17; p = 0.11). In a post-hoc analysis, we assessed whether 
there were regions in our voxelwise atrophy maps where the 
3-gene association was statistically significant. The cumulative effect 
of CLU, CR1 and PICALM SNPs was regressed against voxelwise 
measures of volumetric change over 2 years in maps of the temporal 
lobes. As explained in the Methods section, we used multiple linear 
regression with the 3 SNPs as predictors here, instead of a score 
based on the case-control odds ratios, which do not necessarily re- 
flect the effect of each SNP on structural brain atrophy at each 
voxel. After adjusting for ApoE status and other covariates (i.e., sex, 
age and population structure) and correcting for multiple compari- 
sons across voxels, we found statistically significant spatial effects in 
the temporal lobes for 500 subjects with the rate of volumetric 
brain change 24 months after the first scan, mostly in the entorhinal 
cortex and hippocampus (Fig. 3). In a separate analysis, we also 
added diagnostic status (AD, MCI, or control) as another covariate, 
and obtained similar, in fact stronger, results (Fig. 4). 

4. Discussion 

In this study, we showed that an aggregate genotype risk score 
derived from variants in the CLU, PICALM and CR1 risk genes for 
Alzheimer's disease can be used to enrich a hypothetical clinical 
trial through genetic profiling. Prior studies show successful enrich- 
ment of AD and MCI trials with ApoE4 genotyping (Kohannim et al., 
2010; McEvoy et al., 2010) and ApoE's significant effect on brain atro- 
phy, independently of clinical diagnosis (Potkin et al., 2009). To our 
knowledge, no previous study has investigated whether we can fur- 
ther reduce sample sizes by including more risk variants to select 
the sample. 

We first showed that selecting subjects who carry at least one 
copy of ApoE4 reduced the sample size required to detect a 25% 
reduction in atrophy rate over 2 years (with 80% power) by about 34% in a 
group of MCI and healthy control subjects and 36% in MCI subjects only. 
This reduction is consistent 
with McEvoy et al.'s (2010) reported 10-43% reduction in sample size 
requirements when using ApoE4 to pre-select a sample. Enrichment 
was not explored for cognitively healthy controls in the McEvoy 
study, but we did not find a significant sample size reduction for 
ApoE carriers in this group. This is not surprising, as average atrophy 
maps show a low-grade and diffuse pattern of atrophy in cognitively 
healthy controls, whereas effects are stronger and more concentrated 
in regions such as the temporal lobes in AD and MCI subjects. Prior 
studies have uncovered significant differences in structural and 
functional imaging measures such as cortical thickness (Burggren et al., 
2008) and beta amyloid (Morris et al., 2010) in cognitively healthy 
ApoE carriers, but the effects are either absent in classic AD brain 
areas such as the hippocampus, or they appear to be diffuse across 
several brain regions. Numerical summaries designed to capture 
localized atrophy in the classic AD brain areas (trained by AD subject 
scans) may, therefore, not work well to capture atrophy in healthy 
aging subjects (Hua et al., 2013). 

Fig. 3. Single nucleotide polymorphisms in CLU, PICALM and CR1 are jointly regressed against voxelwise, 3D maps of 24-month temporal lobe atrophy in 500 ADNI subjects with 
available scans. Sex, age, ApoE status and population structure are adjusted for in each regression. Multiple comparisons across voxels are corrected with a regional FDR method; 
only voxels that survive this statistical correction are shown in color. Representative axial and sagittal slices show the statistically significant additive, bilateral 
and rather symmetric effects of the variants in the hippocampus and entorhinal cortex. Warmer colors represent more significant effects. Images are in radiological convention. A 
total of 15,711 voxels survive the p < 0.05 threshold with a p_min of 1.04 × 10⁻⁴. The regional (searchlight) FDR technique renders 3130 voxels significant (i.e., with corrected 
p-values below 0.05). 

In our study, selecting subjects who, in addition to carrying ApoE4, 
carried high-risk combinations of the CLU, CR1 and PICALM 
polymorphisms led to a reduction of 51-58% in sample size, relative to no genetic selection, in a 
mixed group of MCI and controls and 45-52% in MCI subjects only. 
These were statistically significant when compared to a randomized 
subject selection procedure, though when considered strictly inde- 
pendently from ApoE, statistical significance was limited to the 
mixed group of MCI and controls with the 51% reduction. In cogni- 
tively healthy controls, a 56% reduction was observed through 
selecting the top 10% of high-risk subjects, but did not reach signifi- 
cance. A simple explanation may be that we need larger sample 
sizes for the healthy controls to demonstrate significant boosting of 
trial power, whereas we have fewer control subjects than MCI in 
our study. However, the discrepancy between the findings in the 
two diagnostic subgroups may also suggest that clinical trials might 
benefit from either including MCI subjects only, or enrolling a 
mixture of MCI and healthy controls (i.e., enrollment regardless of 
clinical diagnosis) who instead have higher genetic risk scores. 
Enrolling MCI subjects for AD trials is not without its limitations, as 
criteria for MCI may lead to missing AD converters or including 
controls (Visser et al., 2005). Nonetheless, this very limitation may be 
alleviated by further excluding subjects with lower genetic risk. In 
ADNI, as in most other neuroimaging studies, autopsy confirmation 
of AD diagnosis was not possible, and at least in principle, CSF bio- 
markers (for tau protein, etc.) could have been used to corroborate 
the clinical diagnosis, which was made according to the NINCDS- 
ADRDA criteria. CSF measures of Tau, P-Tau, and A-beta are available 
for only a subset of the ADNI subjects who have longitudinal data, so 
we did not use them here to further inform the diagnosis. Also, it 
must be conceded that some MCI participants might harbor AD pa- 
thology and might be considered as having AD in its initial stages. 
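
As a worked check of the arithmetic, the short Python sketch below computes relative reductions from the rounded n80 estimates reported in the Results. The percentages quoted in this section appear to be expressed relative to the unenriched estimate for each group; the helper name and the dictionary layout are hypothetical, and values computed from rounded estimates may differ by a point or two from the quoted figures.

```python
def percent_reduction(baseline_n80, enriched_n80):
    """Relative reduction in sample size, as a percentage of the baseline."""
    return 100.0 * (baseline_n80 - enriched_n80) / baseline_n80


# Rounded n80 estimates as reported in the Results section.
REPORTED = {
    "MCI + controls": {"all": 142, "ApoE4 carriers": 94, "top ~15%": 69, "top ~10%": 60},
    "MCI only": {"all": 105, "ApoE4 carriers": 67, "top ~20%": 58, "top ~10%": 50},
    "controls only": {"all": 120, "ApoE4 carriers": 117, "top ~10%": 55},
}

if __name__ == "__main__":
    for group, estimates in REPORTED.items():
        baseline = estimates["all"]
        for label, n in estimates.items():
            if label == "all":
                continue
            reduction = percent_reduction(baseline, n)
            print(f"{group}, {label}: {baseline} -> {n} ({reduction:.1f}% reduction)")
```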

We further reported regionally significant aggregate effects of CLU, 
PICALM and CR1 variants on TBM-derived three-dimensional maps of 
temporal lobe atrophy. The use of a cumulative genetic risk score has 
substantial practical consequences, beyond the standard practice of 
focusing on ApoE as the dominant risk gene for late-onset AD. The 3 
more recently discovered AD risk genes have been previously 
associated with quantitative, neuroimaging summary measures (reviewed 
in Braskie et al., 2011a) in addition to their high degree of replication 
as AD risk genes in case-control studies (Carrasquillo et al., 2010; 
Ferrari et al., 2012; Harold et al., 2009; Jun et al., 2010; Lambert et 
al., 2009). Significant association of PICALM with MRI-derived 
hippocampal volume and entorhinal cortex thickness, and that of CR1 with 
entorhinal cortex thickness have been recently reported (Biffi et al., 
2010). Another study found an association of CR1 with entorhinal 
cortex volume in young, healthy adults (Bralten et al., 2011). 
Although the association of our 3-SNP risk score with the average 
atrophy rate across the temporal lobe fell short of significance, our results 
provide a voxelwise complement to these studies, as we found 
spatially significant effects in temporal lobe regions that are in 
agreement with their findings. The rs11136000 variant in CLU, though 
not significant in Biffi et al.'s study, has been associated with 
microstructural white matter integrity using diffusion tensor imaging in 
young, healthy adults (Braskie et al., 2011b). Although the ADNI 
cohort was included in the Jun et al. (2010) meta-analysis, the three 
SNPs have been discovered and replicated in multiple, large cohorts 
that do not overlap with ADNI and there is, therefore, no circularity 
in our genetic results; in other words, the SNPs chosen were based 
on their association with AD in many other samples, and not because 
they were associated with brain atrophy in the ADNI dataset 
specifically. 

Fig. 4. Single nucleotide polymorphisms in CLU, PICALM and CR1 are jointly regressed against voxelwise, 3D maps of 24-month temporal lobe atrophy in 500 ADNI subjects with 
available scans. Diagnostic status (i.e., either AD, MCI, or control) in addition to sex, age, ApoE and population structure is adjusted for in each regression. Multiple comparisons 
across voxels are corrected with a regional FDR method; only voxels that survive this statistical correction are shown in color. Representative axial and sagittal slices are shown, 
as in Fig. 3, indicating the statistically significant additive effects of the variants in the hippocampus and entorhinal cortex. Warmer colors represent 'more significant' effects. 
Images are in radiological convention. A total of 23,202 voxels survive the p < 0.05 threshold with a p_min of 2.51 × 10⁻⁵. A subtotal of 6114 voxels survive the corrected p-value 
threshold of 0.05 with the regional (searchlight) FDR technique. 

The outcome measure we implemented here for performing our 
power analyses was based on numerical summaries of temporal 
lobe atrophy derived from tensor-based morphometry of MRI scans 
(Leow et al., 2006). Neuroimaging measures can boost power for clin- 
ical trials (Leung et al., 2010; McEvoy et al., 2010; Nestor et al., 2008; 
Risacher et al., 2010). As detailed in Hua et al. (2009), we used a sta- 
tistically defined region-of-interest in the temporal lobes, which can 
be a more powerful outcome measure for tracking atrophy than a 
standard voxelwise average. This region-of-interest was obtained 
from a training set of 20 AD subjects. Since our enrichment analyses 
focused on MCI and control subjects, overfitting due to training and 
testing with overlapping subjects was avoided. Despite the use of a 
sophisticated imaging-based outcome measure, we demonstrate its 
correlation with clinical dementia scores, and expect that our findings 
with respect to sample size reduction using multiple AD risk genes 
should generalize to trials with purely clinical outcome measures as 
well, as these are top SNPs from case-control genome-wide associa- 
tion studies (Harold et al., 2009; Lambert et al., 2009). 
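
The link between enrichment and sample size can be illustrated with a short calculation. The sketch below assumes the conventional two-arm, two-sided formula for comparing mean change between treatment and placebo groups; the 25% slowing of atrophy, the 80% power, and the mean and standard deviation values are illustrative assumptions rather than figures from this study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(mean_change, sd_change, effect_fraction=0.25,
                        alpha=0.05, power=0.80):
    """Subjects per arm needed to detect a fractional slowing of atrophy.

    Uses n = 2 * sd^2 * (z_alpha + z_beta)^2 / delta^2, where delta is
    effect_fraction * mean_change (the assumed treatment effect).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    delta = effect_fraction * mean_change
    return math.ceil(2 * sd_change ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2)


# Hypothetical 2-year atrophy summaries (percent change), NOT ADNI values.
unenriched = sample_size_per_arm(mean_change=2.0, sd_change=2.0)
# An enriched cohort is assumed to decline faster with similar variability.
enriched = sample_size_per_arm(mean_change=3.0, sd_change=2.0)

print(unenriched, enriched)
```

Because the required sample size scales with the squared ratio of the standard deviation to the mean change, a cohort that declines faster on average needs fewer subjects to show the same proportional drug effect.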

A genotype-based enrichment approach may offer several poten- 
tial advantages for AD drug trials. As discussed in Schork and Topol 
(2010), trials for both prevention and therapy of disease may benefit 
from such enrichment. For prevention, subjects with high-risk genotypes 
(e.g., ApoE4 carriers with multiple CLU, PICALM and CR1 risk alleles 
in our study) may be more likely to respond to a preventive 
agent, thereby making trials more cost-effective and efficient. The 
presence of subjects in a trial who are unlikely to decline at all 
makes it difficult to pick up drug effects. Though our multilocus en- 
richment approach is related to that of other AD enrichment studies, 
where neuroimaging and cerebrospinal fluid biomarkers are shown 
to help select at-risk subjects, it is particularly helpful as it may be 
less costly and can be obtained from subjects quickly and efficiently, 
and potentially avoid issues such as patients' willingness to undergo 
MRI or a lumbar puncture. Although several of the sample size reduc- 
tions we obtained from our candidate genes were not significant after 
adjustment for ApoE alone, the general risk score approach, perhaps 
fine-tuned with additional genes and larger numbers of subjects, 
has the potential to offer such advantages. 

A genotype profiling enrichment strategy is also promising for 
therapeutic trials, as risk genes often reflect molecular mechanisms 
of disease and, arguably, may affect subjects' degree of responsiveness 
to specific therapeutics. The protein products of CLU, CR1, and 
PICALM are linked to inflammatory, amyloid, lipid and chaperone 
pathways that are implicated in neurodegeneration (Sleegers et al., 
2010). Genetic profiling based on these candidate genes may 
help with AD trial enrichment. This overall approach is related to 
pharmacogenetic sampling (i.e., selecting genetic subgroups of pa- 
tients who are more likely to respond to treatment; Schork and 
Topol, 2010). However, the main premise is to avoid sampling people 
who will not decline during the trial, and preferentially sample those 
who will. This sampling may also be performed early, which may help 
in cases where therapeutics work better in the earlier stages of a disease. 
Such pharmacogenetic designs have already been explored with 
ApoE in the context of AD trials (Roses, 2009; Zamani et al., 2011) and 
hinted at in mouse model studies (Cramer et al., 2012). 

Designing clinical trials based on genotypes also has some challenges 
and limitations. Screening out subjects with certain genotypes 
may even cause a trial to miss treatment effects, particularly if it is uncertain how the risk 
gene affects responses to a therapeutic agent. In theory at least, people 
at low genetic risk may respond better to a treatment; regardless of 
this, however, their response may be extremely challenging to detect 
if they decline less than others when untreated. In addition, our study 
is limited to four AD risk genes. The three candidate SNPs we included 
in addition to ApoE are top GWAS findings, with recent replications in 
very large cohorts (see Introduction). There are, however, other AD 
risk variants that might potentially be included in a risk score for 
trial enrichment, particularly once they are further validated. Genetic 
variants in BIN1 (Kingwell, 2013; Lambert et al., 2011) and those in 
ABCA7, EPHA1, CD33, CD2AP, and MS4A6A/MS4A4E (Hollingworth et 
al., 2011), for instance, were recently discovered in large AD GWAS 
studies, some of which may have roles in AD pathogenesis (Holton 
et al., 2013). Genetic variants in TOMM40 have also been linked to 
AD (Roses et al., 2010), though the proximity and linkage disequilibri- 
um between this gene and ApoE led to recent investigations that failed 
to replicate this association and cast doubt on ApoE-independent risk 
signals in the chromosomal region (Cruchaga et al., 2011; Jun et al., 
2012). In practice, our approach may be too selective for a clinical 
trial, but nonetheless serves as a proof-of-principle study to demon- 
strate enrichment using multiple genotypes. 

Another limitation of our study is that we did not include gene- 
gene interactions in our model. As the first paper to consider genes 
beyond ApoE for clinical trial enrichment, we started with a linear, 
additive model, modeling SNP effects based on well-recognized, 
GWAS-validated odds ratios. Some interactions between our candidate 
genes, particularly those between PICALM and ApoE, have been 
reported in prior work (Jun et al., 2010). However, they have not always 
been replicated. The first GWAS that identified PICALM in AD 
(Harold et al., 2009), in fact, failed to find such an interaction, making effect 
size estimation for interaction terms more complex. In addition, 
we may be underpowered to pick up interaction effects, as they as- 
sume a main effect is present, and that modulatory effects on the 
main effect can be detected. Truly vast samples, as well as novel com- 
putational strategies, may be needed to detect interactions among all 
pairs of N SNPs. One effort in this regard is Hibar et al. (2013). Envi- 
ronmental factors may also modify genetic susceptibility in the con- 
text of complex traits and disorders (De Jager et al., 2009; Morrison 
et al., 2007; Oj et al., 2012). Nonetheless, the general approach pro- 
posed here could also be extended to include any known environ- 
mental risk factors that might modulate disease risk (such as 
educational level, exercise, and alcohol intake). 
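
A linear, additive risk score of the kind discussed above can be sketched as follows. This is a minimal illustration that assumes each SNP contributes its risk-allele count weighted by the natural log of a per-allele odds ratio; the odds ratios shown are placeholders chosen only for illustration (not the published estimates), ApoE is omitted for brevity, and the function names are hypothetical.

```python
import math

# Placeholder per-allele odds ratios, for illustration only; real weights
# would come from published GWAS meta-analyses.
ODDS_RATIOS = {"CLU": 1.15, "PICALM": 1.15, "CR1": 1.15}


def additive_risk_score(risk_allele_counts, odds_ratios=ODDS_RATIOS):
    """Sum of risk-allele counts (0, 1 or 2), each weighted by log(OR)."""
    score = 0.0
    for gene, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{gene}: allele count must be 0, 1 or 2")
        score += count * math.log(odds_ratios[gene])
    return score


def rank_by_risk(subjects):
    """Return subject IDs ordered from highest to lowest aggregate risk."""
    return sorted(subjects,
                  key=lambda sid: additive_risk_score(subjects[sid]),
                  reverse=True)


# Hypothetical genotypes for three subjects.
subjects = {
    "s1": {"CLU": 2, "PICALM": 2, "CR1": 1},
    "s2": {"CLU": 0, "PICALM": 1, "CR1": 0},
    "s3": {"CLU": 1, "PICALM": 2, "CR1": 0},
}
print(rank_by_risk(subjects))
```

A model with gene-gene interactions would add product terms to this sum, each of which needs its own effect size estimate; that requirement is the practical obstacle described in the paragraph above.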

Allele frequencies in the general population may also raise some 
practical issues for implementing this approach. In our sample, the 
risk alleles for PICALM and CLU are relatively common (major allele frequencies 
of 0.74 and 0.72, respectively), but that of CR1 is lower (minor 
allele frequency of 0.18). Some studies report a higher MAF of 0.26 for 
the CR1 SNP (Jun et al., 2010), but our MAF estimate is consistent with 
those reported in initial GWAS studies in Caucasian subjects (Lambert 
et al., 2009). Uncommon genotypes may cause a trial to be more expen- 
sive if a large number of subjects need to be screened (Schork and 
Topol, 2010). The cost/benefit depends on the cost of screening proce- 
dures relative to neuroimaging and longitudinal assessment, which 
may be much greater. In addition, pharmacogenomic profiling intro- 
duces ethical dilemmas, which complicate its application to drug trials; 
specifically some people may not have access to a new treatment as 
quickly as others, if genetic tests are among the criteria required to ob- 
tain it (Issa, 2002). Nevertheless, as risks and benefits of using a 
multilocus genetic profiling are weighed for specific trials and more is 
understood about our candidate risk genes and their roles in AD pathology, 
our results may have important implications for the design of AD 
clinical trials. At the very least, genotyping could be done retrospective- 
ly, to see if effects would have been more powerful if obtained from a 
pre-selected sample. 
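
The screening burden implied by the allele frequencies quoted above can be estimated with a rough calculation. The sketch below assumes Hardy-Weinberg proportions at each locus and independence between the three loci, and it treats the sample frequencies reported here (0.74, 0.72 and 0.18) as if they applied to the screened population; all three are simplifying assumptions, and the function names are hypothetical.

```python
import math
from itertools import product

# Risk-allele frequencies as reported in the text for this sample.
RISK_ALLELE_FREQ = {"PICALM": 0.74, "CLU": 0.72, "CR1": 0.18}


def genotype_probs(p):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 risk alleles."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def fraction_with_at_least(min_alleles, freqs=RISK_ALLELE_FREQ):
    """Expected fraction of people carrying >= min_alleles risk alleles."""
    per_locus = [genotype_probs(p) for p in freqs.values()]
    total = 0.0
    # Enumerate every combination of allele counts across the loci.
    for counts in product(range(3), repeat=len(per_locus)):
        if sum(counts) >= min_alleles:
            prob = 1.0
            for locus_probs, count in zip(per_locus, counts):
                prob *= locus_probs[count]
            total += prob
    return total


def subjects_to_screen(n_needed, min_alleles):
    """Approximate number of people to genotype to enroll n_needed subjects."""
    return math.ceil(n_needed / fraction_with_at_least(min_alleles))


for k in range(7):
    print(k, round(fraction_with_at_least(k), 4), subjects_to_screen(100, k))
```

Under these assumptions, stricter thresholds shrink the eligible fraction quickly, largely because of the less common CR1 risk allele, which is why the cost of genotyping has to be weighed against the cost of imaging and follow-up.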